ABSTRACT: The traditional method of implementing FIR filters consumes considerable hardware resources,
which works against reducing the circuit scale and increasing the system speed. This paper presents a new
design and implementation of FIR filters using Distributed Arithmetic to solve this problem. The
Distributed Arithmetic structure is used to improve resource usage, while a pipeline structure is used to
increase the system speed. In addition, the divided LUT method is used to reduce the required memory
units. The simulation results indicate that FIR filters using Distributed Arithmetic work stably at high
speed and save almost 50 percent of the hardware resources, which reduces the circuit scale. They can be
applied to a variety of areas thanks to their great flexibility and high reliability.



I. Introduction 

Digital filters are essential units of digital signal processing systems. Traditionally, digital filters
are implemented on a Digital Signal Processor (DSP), but a DSP-based solution cannot meet the high-speed
requirements of some applications because of its sequential structure. Nowadays, Field Programmable Gate Array
(FPGA) technology is widely used in digital signal processing because an FPGA-based solution can achieve
high speed owing to its parallel structure and configurable logic, which also provide great flexibility and high
reliability during design and later maintenance. In general, digital filters are divided into two
categories: Finite Impulse Response (FIR) and Infinite Impulse Response (IIR). FIR filters are
widely applied in digital signal processing because they provide linear phase and system
stability.

FPGA-based FIR filters using traditional direct arithmetic consume a growing number of
multiply-and-accumulate (MAC) blocks as the filter order increases. With Distributed Arithmetic, however,
we can build a Look-Up Table (LUT) that stores the MAC values and read them out according to the input
data when needed. The LUT can therefore take the place of the MAC units and save hardware
resources. This paper presents the principles of Distributed Arithmetic, introduces them into FIR filter
design, and then presents a 31-order FIR low-pass filter using Distributed Arithmetic, which saves considerable
MAC blocks and so reduces the circuit scale. Meanwhile, the divided LUT method is used to reduce the required
memory units, and a pipeline structure is used to increase the system speed.


II. Distributed Arithmetic 

Distributed Arithmetic was first proposed by Croisier [1], was extended to cover signed data
systems by Liu, and was later introduced into FPGA design to save MAC blocks as FPGA
technology developed.

The N-length FIR filter can be described as: 

$$y = \langle h, x \rangle = \sum_{n=0}^{N-1} h[n]\, x[n] \qquad (1)$$

where h[n] is the filter coefficient and x[n] is the input sequence to be processed. The FIR structure
consists of a series of multiplication and addition units and consumes N MAC blocks of the FPGA, which are
expensive in a high-speed system. Compared with traditional direct arithmetic, Distributed Arithmetic can save
considerable hardware resources by using a LUT in place of the MAC units [2]. Another virtue of this
method is that it avoids the decrease in system speed as the input data bit width or the filter
coefficient bit width increases, which can occur in the traditional direct method and consumes considerable
hardware resources [3].

Fig. 1 The basic Distributed Arithmetic structure
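
Equation (1) can be unrolled bit by bit to show where the LUT comes from. The derivation below is a sketch under an assumption not stated in the text: each input sample x[n] is a B-bit two's-complement integer with bits x_b[n] in {0, 1}. The symbols B, x_b[n], and f are introduced here for illustration only.

```latex
\begin{align}
x[n] &= -2^{B-1}\, x_{B-1}[n] + \sum_{b=0}^{B-2} 2^{b}\, x_b[n], \\
y &= \sum_{n=0}^{N-1} h[n]\, x[n]
   = -2^{B-1}\, f(x_{B-1}) + \sum_{b=0}^{B-2} 2^{b}\, f(x_b), \\
f(x_b) &= \sum_{n=0}^{N-1} h[n]\, x_b[n].
\end{align}
```

Because every x_b[n] is a single bit, f(x_b) can take at most 2^N distinct values. These values can be precomputed and stored in a LUT addressed by the bits (x_b[N-1], ..., x_b[0]), so the multiplications in (1) reduce to table reads, shifts, and additions.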


III. Filters Design 

In the course of FIR filter design, ringing can be generated at the edges of the transition band because
a finite Fourier series cannot produce sharp edges [5]. Windows are therefore often used to produce a
suitable transition band, and the Kaiser window is widely used because it provides good performance. The
parameter β is an important coefficient of the Kaiser window.
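
As an illustration of the windowing step, the following is a minimal sketch of a Kaiser-windowed low-pass design in plain Python. The tap count, the normalized cutoff, and the shape parameter value are hypothetical choices made for the example and are not taken from this paper; the function names are likewise illustrative.

```python
import math


def bessel_i0(x, terms=40):
    """Zeroth-order modified Bessel function of the first kind (power series)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total


def kaiser_window(num_taps, beta):
    """Kaiser window of length num_taps with shape parameter beta."""
    m = num_taps - 1
    window = []
    for n in range(num_taps):
        ratio = 2.0 * n / m - 1.0
        arg = beta * math.sqrt(max(0.0, 1.0 - ratio * ratio))
        window.append(bessel_i0(arg) / bessel_i0(beta))
    return window


def lowpass_kaiser(num_taps, cutoff, beta):
    """Windowed-sinc low-pass taps; cutoff is a fraction of the sampling rate."""
    m = num_taps - 1
    window = kaiser_window(num_taps, beta)
    taps = []
    for n in range(num_taps):
        t = n - m / 2.0
        if t == 0:
            ideal = 2.0 * cutoff
        else:
            ideal = math.sin(2.0 * math.pi * cutoff * t) / (math.pi * t)
        taps.append(ideal * window[n])
    gain = sum(taps)  # normalize to unity gain at DC
    return [c / gain for c in taps]


if __name__ == "__main__":
    # Hypothetical parameters: 32 taps, cutoff at 0.1 of the sampling rate, beta = 5.
    for coefficient in lowpass_kaiser(32, 0.1, 5.0):
        print(f"{coefficient:+.6f}")
```

A larger shape parameter lowers the side lobes at the cost of a wider transition band, which is why it has to be chosen together with the filter order.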

However, as the filter order increases, the size of the LUT grows dramatically, which costs
more time to look up the table and more memory to store the values. Therefore, we can divide the LUT
into several smaller LUTs to reduce the required memory units.


Tab. 2 Coefficient values of LUT

Address    Data
0000       0
0001       h[0]
0010       h[1]
0011       h[0] + h[1]
0100       h[2]
0101       h[0] + h[2]
0110       h[1] + h[2]
0111       h[0] + h[1] + h[2]
1000       h[3]
1001       h[0] + h[3]
1010       h[1] + h[3]
1011       h[0] + h[1] + h[3]
1100       h[2] + h[3]
1101       h[0] + h[2] + h[3]
1110       h[1] + h[2] + h[3]
1111       h[0] + h[1] + h[2] + h[3]
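
The table above can be generated programmatically, and several such 4-input tables can be combined to evaluate a longer inner product. The sketch below assumes integer (already quantized) coefficients and two's-complement input samples. The function names and the group size of 4 are illustrative choices that mirror the 4-bit addresses of Tab. 2; they are not the paper's actual implementation.

```python
def build_lut(coeffs):
    """Entry at address a holds the sum of coeffs[i] for every set bit i of a."""
    return [
        sum(c for i, c in enumerate(coeffs) if (address >> i) & 1)
        for address in range(1 << len(coeffs))
    ]


def da_dot(coeffs, samples, bit_width, group=4):
    """Inner product of coeffs and samples using divided LUTs, one bit plane at a time.

    Assumption: samples are signed integers that fit in bit_width bits
    (two's complement), and coeffs are integers.
    """
    luts = [build_lut(coeffs[k:k + group]) for k in range(0, len(coeffs), group)]
    acc = 0
    for b in range(bit_width):
        partial = 0
        for g, lut in enumerate(luts):
            chunk = samples[g * group:(g + 1) * group]
            # Bit b of each sample in the group forms the LUT address.
            address = sum(((x >> b) & 1) << i for i, x in enumerate(chunk))
            partial += lut[address]
        if b == bit_width - 1:
            acc -= partial << b  # the sign bit carries negative weight
        else:
            acc += partial << b
    return acc


if __name__ == "__main__":
    h = [3, -1, 4, 1, -5, 9, 2, -6]   # hypothetical quantized coefficients
    x = [7, -8, 0, 5, -3, 2, 1, -1]   # hypothetical 4-bit signed samples
    print(da_dot(h, x, bit_width=4), sum(c * s for c, s in zip(h, x)))
```

With this grouping, N coefficients need N/4 tables of 16 entries each instead of one table of 2^N entries. For example, 32 coefficients would need 8 x 16 = 128 entries rather than 2^32.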


A pipeline structure is also used to increase the system speed. Pipelining divides a combinational
circuit into smaller parts and inserts a register between adjacent parts to increase the system speed [9].
The filter designed in this paper contains three levels of registers. Although this increases the time
delay, it helps to increase the system speed [10]. Considering all the factors above, we obtain the new
structure based on Distributed Arithmetic shown in Fig. 3.
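
The trade-off between delay and speed described above can be illustrated with a small software model of a register pipeline. This is a behavioral sketch only: the three stage functions are hypothetical placeholders, since the contents of the paper's three register levels are not specified here.

```python
class Pipeline:
    """Chain of combinational stages, each followed by a register."""

    def __init__(self, stages):
        self.stages = stages
        self.registers = [None] * len(stages)  # None marks an empty register

    def clock(self, value):
        """Apply one clock edge and return the content of the last register."""
        # Update from the last register backwards so that each stage
        # consumes the value its predecessor held before this edge.
        for i in range(len(self.stages) - 1, 0, -1):
            previous = self.registers[i - 1]
            self.registers[i] = None if previous is None else self.stages[i](previous)
        self.registers[0] = None if value is None else self.stages[0](value)
        return self.registers[-1]


if __name__ == "__main__":
    # Hypothetical three-stage datapath.
    pipe = Pipeline([lambda v: v + 1, lambda v: v * 2, lambda v: v - 3])
    for sample in [1, 2, 3, 4, 5]:
        print(pipe.clock(sample))
```

The first result leaves the model on the third clock edge, and one result follows on every edge after that. In hardware, each stage is shorter than the undivided logic, so the clock period can be reduced even though each individual sample takes more cycles to pass through.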


IV. Conclusion

This paper presents a design and implementation based on Distributed Arithmetic, which is used to
realize a 31-order FIR low-pass filter. The Distributed Arithmetic structure is used to improve resource usage,
while the pipeline structure is used to increase the system speed. The test results indicate that the designed
filter using Distributed Arithmetic works stably at high speed and saves almost 50 percent of the hardware
resources. Meanwhile, the filter is easy to port to other applications by modifying the order, the bit
width, and other parameters, and it therefore has great practical value in digital signal
processing.


V. Result 